Fix test_vllm_npu_worker_class_resolves: tolerate version mismatch #1
Open
UsernameFull wants to merge 109 commits into
Co-Authored-By: chengengru.cgr <chengengru.cgr@taobao.com>
Co-Authored-By: fengjingxuan.fjx <fengjingxuan.fjx@alibaba-inc.com>
Co-Authored-By: ft498870 <ft498870@taobao.com>
Co-Authored-By: heyancheng.hyc <heyancheng.hyc@taobao.com>
Co-Authored-By: hongzhen.yj <hongzhen.yj@alibaba-inc.com>
Co-Authored-By: huangju.hj <huangju.hj@alibaba-inc.com>
Co-Authored-By: jiamang.wang <jiamang.wang@alibaba-inc.com>
Co-Authored-By: scott.lxy <scott.lxy@taobao.com>
Co-Authored-By: shenjingyu.sjy <shenjingyu.sjy@alibaba-inc.com>
Co-Authored-By: shenliao.sla <shenliao.sla@taobao.com>
Co-Authored-By: tianhe.lzd <tianhe.lzd@alibaba-inc.com>
Co-Authored-By: weixun.wwx <weixun.wwx@alibaba-inc.com>
Co-Authored-By: wzy496492 <wzy496492@alibaba-inc.com>
Co-Authored-By: xiongshaopan.xsp <xiongshaopan.xsp@alibaba-inc.com>
Co-Authored-By: xuehuanran.xhr <xuehuanran.xhr@alibaba-inc.com>
Co-Authored-By: zhaohaizhou.zhz <zhaohaizhou.zhz@alibaba-inc.com>
Co-Authored-By: bzd02333762 <bzd02333762@alibaba-inc.com>
Co-authored-by: beiyue.lj <beiyue.lj@alibaba-inc.com>
Co-Authored-By: lt511297 <lt511297@alibaba-inc.com>
Removed the call to upload checkpoint to MOS after saving.
to correct `group_size` instead of `gropu_size`
…emove rlvr_math_vlm_pipeline
Previously, is_last_step was passed via **kwargs and transparently forwarded to DeepSpeedEngine.save_checkpoint(), which does not accept this argument, causing a TypeError at checkpoint time.

Fix by explicitly declaring is_last_step=None in the signature (consistent with megatron_strategy and fsdp2_strategy), and applying the same async_upload guard logic as the other strategies.

Signed-off-by: Xuchun Shang <xuchun.shang@linux.alibaba.com>
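The failure mode described in this commit message can be reproduced with stand-in classes. The sketch below is illustrative only: `FakeEngine`, `BuggyStrategy`, `FixedStrategy`, and `call_raises_type_error` are hypothetical names rather than the project's or DeepSpeed's code, and the upload guard is an assumed simplification of whatever the other strategies actually do.

```python
class FakeEngine:
    """Hypothetical stand-in for an engine with a fixed save_checkpoint signature."""

    def __init__(self):
        self.saved = []

    def save_checkpoint(self, save_dir, tag=None):
        self.saved.append((save_dir, tag))


class BuggyStrategy:
    """Forwards every keyword argument, including is_last_step, to the engine."""

    def __init__(self, engine):
        self.engine = engine

    def save_checkpoint(self, save_dir, **kwargs):
        self.engine.save_checkpoint(save_dir, **kwargs)


class FixedStrategy:
    """Declares is_last_step explicitly so it never reaches the engine."""

    def __init__(self, engine, async_upload=True):
        self.engine = engine
        self.async_upload = async_upload
        self.waited_for_upload = False

    def save_checkpoint(self, save_dir, is_last_step=None, **kwargs):
        self.engine.save_checkpoint(save_dir, **kwargs)
        # Assumed guard (not taken from the source): block on the upload when
        # uploads are synchronous or when this is the final step.
        if not self.async_upload or is_last_step:
            self.waited_for_upload = True


def call_raises_type_error(strategy):
    """Return True if saving with is_last_step=True raises TypeError."""
    try:
        strategy.save_checkpoint("ckpt", tag="step_1", is_last_step=True)
    except TypeError:
        return True
    return False
```

With these stand-ins, `BuggyStrategy` fails because the engine receives an unexpected `is_last_step` keyword, while `FixedStrategy` consumes it before forwarding the remaining arguments.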
- Fix socket resource leak in get_node_ip() by properly closing socket
- Replace list comprehension with proper loop in destroy_placement_group() for better error handling

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
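Both changes in this commit follow common patterns, sketched below under stated assumptions. The function bodies are not the project's code: the UDP probe address, the loopback fallback, and the `destroy_all` helper with its injected `remove` callable are hypothetical choices made so the example runs without Ray installed.

```python
import socket


def get_node_ip(probe_addr=("8.8.8.8", 80)):
    """Return the local IP used to reach probe_addr, closing the socket afterwards."""
    # The with-statement closes the socket on every exit path, including errors.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            s.connect(probe_addr)  # UDP connect selects a route; no packet is sent
            return s.getsockname()[0]
        except OSError:
            # Assumed fallback when no route is available.
            return "127.0.0.1"


def destroy_all(groups, remove):
    """Call remove(group) for each group and collect failures instead of aborting."""
    failures = []
    for group in groups:
        try:
            remove(group)
        except Exception as exc:  # keep going so later groups are still removed
            failures.append((group, exc))
    return failures
```

An explicit loop makes it possible to handle each failure individually, which a list comprehension used only for its side effects cannot do without extra machinery.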
…endency on ray._private.ray_constants.
…llout chat_template function typo.
3822431 to 33fd25c
Add CPU and Ascend NPU CI workflows, vLLM/SGLang NPU compatibility fixes, and CI-stable test adaptations.
Test fix for version incompatibility between vllm_ascend and expected import path.
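One way a test could tolerate an import path that moves between versions is to try several candidate dotted paths and accept the first that resolves. This is a minimal sketch, not the change made in this PR: `resolve_first` is a hypothetical helper, and the real candidate paths for the vllm_ascend worker class are not given here.

```python
import importlib


def resolve_first(candidates):
    """Return the first attribute that resolves from dotted 'module.Attr' paths.

    Returns None when no candidate can be imported, so the caller can decide
    whether to skip or fail.
    """
    for dotted in candidates:
        module_name, _, attr = dotted.rpartition(".")
        try:
            module = importlib.import_module(module_name)
        except (ImportError, ValueError):
            # Module missing in this version, or the path had no module part.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    return None
```

In a pytest test, a `None` result could be turned into `pytest.skip(...)` rather than a failure, so the test only fails when a path resolves to something unexpected.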